3 <110> APPLICANT: Rosetta Inpharmatics LLC

4 Schadt, Eric

5 Monks, Stephanie 

7 <120> TITLE OF INVENTION: COMPUTER SYSTEMS AND METHODS FOR ASSOCIATING GENES WITH TRAITS
8 USING CROSS SPECIES DATA
10 <130> FILE REFERENCE: 9301-210-228 

C--> 12 <140> CURRENT APPLICATION NUMBER: US/10/540,405
C--> 13 <141> CURRENT FILING DATE: 2005-06-22 

15 <150> PRIOR APPLICATION NUMBER: 60/436,684 

16 <151> PRIOR FILING DATE: 2002-12-27 

18 <150> PRIOR APPLICATION NUMBER: 60/460,343 

19 <151> PRIOR FILING DATE: 2003-04-03 
21 <160> NUMBER OF SEQ ID NOS: 30
23 <170> SOFTWARE: PatentIn version 3.2

25 <210> SEQ ID NO: 1 

26 <211> LENGTH: 1583 

27 <212> TYPE: DNA 

28 <213> ORGANISM: Mus musculus 

30 <400> SEQUENCE: 1 

31 gagctattcg gcctctctag gccggcgggt cctccgctcc atggtcctgt ctgtcagcgc 60 
33 tgtgtcagga ggccagtgcc gaggtccggt cgcgctccga cgcttcgacc ctcgagccgg 120 
35 tcgcgggtat cccggcggcc gcgggacgat ggcgtggtgg cactgacagg cgcgggcggc 180 
37 tgccgagccc cgcggccggc atggcgggcc agttccgcag ctacgtgtgg gacccgttgc 240 
39 taatcctgtc gcagatcgta ctcatgcaga ccgtctacta tggctctctg ggcctgtggc 300 
41 tggcgctggt ggacgcgctg gtgcgcaagc ccgtccctgg accagatgtt cgacgcggag 360 
43 atcctgggct tctccacccc tccaggccgg ctctcaatga tgtccttcgt cctcaacgcc 420
45 ctcacctgtg ccctgggctt gctgtacttc atccggcgag ggaagcagtg cctggatttc 480 
47 actgtcactg tgcatttctt tcacctcctg ggctgctggc tctacagctc ccgtttcccc 540 
49 tcggcgctga cctggtggct ggtccaggct gtgtgcattg cactcatggc cgtcatcggg 600 
51 gagtacctgt gcatgcggac ggagctcaag gagatccccc tcagctcagc ccctaagtcc 660 
53 aatgtctaga gttgggccct ttggacactc tgctggcact tgggccccat caccttgggc 720 
55 tgctcagacc tccagatggg gtctggccca agtctgagca gaaccctgga aatgtgaagt 780 
57 ctgttggtgg agagataatg aggtcccatc ataaaggcag gtagcagcca tgatcacaga 840 
59 tgtaagaatg gcctctgtct gccaaagcct tgatatctgg aggccagtaa gggacctcat 900 
61 ggagggtagt ggcagatttg gaaccatgtc acatgagcca tcatactgtc accagcctgt 960 
63 tattttaaaa agaaaaaaaa aaaatcaagg atatctgatt ggaataaacc actcttctcg 1020
65 ttgtctgtct tatgcccatg acagccagta cctttgctgt gttgccaaac cacagggatt 1080 
67 ctctgtggag aaatacctga tttctgggtc catagccaca gaaaaagatg taggtacaga 1140 
69 gtgctaggct gctgacagga cgtcgagggg aggaggcatc aagcacaaga aaaatgcatg 1200 
71 gccgtgccgt tagacacaca cacacacttt tgtgtgtgtc caggacccat gactgtctcc 1260 
73 ctccagttcc ctgtatggac tctgccttgc tgttgtcact cagcacagcc agagacagga 1320
75 cccagagaaa accccagcat ccctcccagc cttcccttca taataaaagc cattgtctgc 1380 
77 tctctggaag tgagcaggca gccagcttct actggacctc aactgtggca ggagtttctg 1440 
79 tttgctgtct tttgagttct gtgataggga gggtgtacta aaggtgctgg aggctcaccc 1500

81 tgctaagctt tcttccaagt ggtttcctca ggaagggctg gcagctgtcc ttcctaggta 1560 

83 cataaataca ctattttcca atc 1583

86 <210> SEQ ID NO: 2 

87 <211> LENGTH: 2677 

88 <212> TYPE: DNA 

89 <213> ORGANISM: Homo sapiens 

91 <400> SEQUENCE: 2

92 tctaggccgg cagcgcctct cctccatggt cctgtctgtc agcgctgttt tgggagcccg 60
94 ccggtgaggc cgggccacgc tcagacactt cgatcgtcga gtctgtcact gggcatggcg 120
96 ggtcagttcc gcagctacgt gtgggacccg ctgctgatcc tgtcgcagat cgtcctcatg 180
98 cagaccgtgt attacggctc gctgggcctg tggctggcgc tggtggacgg gctagtgcga 240
100 cagcccctcg ctggaccaga tgttcgacgc cgagatcctg ggcttttcca cccctccagg 300 
102 ccggctctcc atgatgtcct tcatcctcaa cgccctcacc tgtgccctgg gcttgctgta 360 
104 cttcatccgg cgaggaaagc agtgtctgga tttcactgtc actgtccatt tctttcacct 420
106 cctgggctgc tggttctaca gctcccgttt cccctcggcg ctgacctggt ggctggtcca 480 
108 agccgtgtgc attgcactca tggctgtcat cggggagtac ctgtgcatgc ggacggagct 540
110 caaggagata cccctcaact cagcccctaa atccaatgtc tagaatcagg ccctttggac 600 
112 atcctgctga cacttgggcc ccttaacacc ttgggctgct cagaccctcc agatgaggtc 660
114 cagcccagat ctgagaggaa ccctggaaat gtgaagtctc tgttggtttg ggagagatag 720 
116 tgagggcctg tcaaagaagg caggtagcag tcagcatgac agctgcaaga atgacctctg 780
118 tctgttgaag ccttggtatc tgagaggtca ggaaggggac ctctttgagg gtaataacag 840 
120 aattggaacc atgccactct tgagccacaa tacctgtcac cagcctgttg ttttaagaga 900 
122 gaaaaaaaat caaggatatc tgattggagc aaaccacttc tttagtcatc tgtcttaccc 960 
124 ccctgggaca gctgttacct ttgcagtgtt gccgaatcac agcagttacc tttgcagtgt 1020
126 tgccgaatca cagcagttct gttggagaaa cgcttggttt ccggatccag agccacagaa 1080
128 agaaatgtag gtgtgaagta ttaggctgct gtcagggaga ggatggcaga tggaggcatc 1140
130 aagcacaagg aaaatgcaca acctgtgccc tgttatacac acgttcatgt gcacccaaga 1200
132 acctatgact ttcttccagt tccttctacc aggtccccat cctgctgcca gctctcaaca 1260 
134 tagcaggcca taggacccag agaagaatcc cagcgttgct caaagtctaa ccatcataaa 1320
136 gacactgcct gtcttctagg aatgaccagg cacccagctc ccactggact ccaatttttt 1380 
138 ttcctgcctt atttagaatt ctttggcggg aagggtatga tgggttccca gagacaagaa 1440
140 gcccaacctt ctggcctggg ctgtgctgat agtgctgagg gagataggaa tttgctgcta 1500
142 agatttttct ttggggtgga gtttcctctg tgaggggctt gcagctatcc ttcctgtgta 1560 
144 tacaaataca gtattttcca tggttctgcc tgcacttact ttgtaatgcc acggttgaga 1620
146 ttgagagaga tcagcgcagc caggcaaggg aactttaaag aattattagg ccaccttctc 1680 
148 cctttcctgg accccagagt cattcctcca tttggttaaa atactcagtg cagggaactc 1740 
150 ttacatcctg tctccttcac ttgcagcgtc ccctgctatg cctcaggtga accacataat 1800 
152 tcttgggttt ccgttcctac ttgctagtga tttctgaaca tgttcaatgg agcggcacac 1860 
154 agtctagacc cacttccgca ttgaaacctt cactgttcct ctttggtttc ttcagagctt 1920
156 tcccaagaga gctgtcagtt ttcagctgtc agtaacacaa atgagtttat ggtaacacaa 1980 
158 atgagttttg ctatctctct gagaagctca tctgacctcc tgactctcag ccctacagag 2040 
160 tagggagttg atgctgacag gatgaagatt taggaataaa tatgcctggg aagagactgg 2100 
162 gaaggttcta gggtgaggca cctcagtaac tcatggtacc ttggccaagt tggaaggaag 2160 
164 cagtttgtta atgaggcaca gtaatcctgg ctgcagggtc taggaggtaa gaccagctgg 2220
166 gatgaccttc cctgggttaa tcaatttccc tctagacaac acaaactgca ggcatgtgac 2280
168 taactttgaa agaacaccca tcatgtggct gctgtcaccc ttgaccagcc gtggtggtgg 2340 
170 ttactccatc tgtggttgga gcgcctcttt gggattcact tcaaggtctt gtgcctattt 2400 
172 ttctgcatat cttctgtgat gacaaatctc tgtcccctga gtgttaattt gatttttaga 2460
174 aatggccaaa agtcacgtga tccaaacttt ttttcagtaa tatggagact gagctgcatg 2520
176 gtagttgggg atcaaaaata tgtgacctta atgagatttt tatgatttct aaagtaacaa 2580
178 taaaagcagt ttttagagtt gagttccaga gagggcaggg caatggcagt gacatgtttg 2640
180 tcattttaat aataaataac atctattgag tgcttaa 2677

183 <210> SEQ ID NO: 3 

184 <211> LENGTH: 453 

185 <212> TYPE: DNA

186 <213> ORGANISM: Homo sapiens 

188 <400> SEQUENCE: 3 

189 atggcgggtc agttccgcag ctacgtgtgg gacccgctgc tgatcctgtc gcagatcgtc 60
191 ctcatgcaga ccgtgtatta cggctcgctg ggcctgtggc tggcgctggt ggacgggcta 120
193 gtgcgacagc ccctcgctgg accagatgtt cgacgccgag atcctgggct tttccacccc 180
195 tccaggccgg ctctccatga tgtccttcat cctcaacgcc ctcacctgtg ccctgggctt 240
197 gctgtacttc atccggcgag gaaagcagtg tctggatttc actgtcactg tccatttctt 300
199 tcacctcctg ggctgctggt tctacagctc ccgtttcccc tcggcgctga cctggtggct 360
201 ggtccaagcc gtgtgcattg cactcatggc tgtcatcggg gagtacctgt gcatgcggac 420
203 ggagctcaag gagatacccc tcaactcagc ccc 453

206 <210> SEQ ID NO: 4 

207 <211> LENGTH: 156
208 <212> TYPE: PRT

209 <213> ORGANISM: Mus musculus 
211 <400> SEQUENCE: 4 



213 Met Ala Gly Gln Phe Arg Ser Tyr Val Trp Asp Pro Leu Leu Ile Leu
214 1 5 10 15

217 Ser Gln Ile Val Leu Met Gln Thr Val Tyr Tyr Gly Ser Leu Gly Leu
218 20 25 30

221 Trp Leu Ala Leu Val Asp Ala Leu Val Arg Ser Ser Pro Ser Leu Asp
222 35 40 45

225 Gln Met Phe Asp Ala Glu Ile Leu Gly Phe Ser Thr Pro Pro Gly Arg
226 50 55 60

229 Leu Ser Met Met Ser Phe Val Leu Asn Ala Leu Thr Cys Ala Leu Gly
230 65 70 75 80

233 Leu Leu Tyr Phe Ile Arg Arg Gly Lys Gln Cys Leu Asp Phe Thr Val
234 85 90 95

237 Thr Val His Phe Phe His Leu Leu Gly Cys Trp Leu Tyr Ser Ser Arg
238 100 105 110

241 Phe Pro Ser Ala Leu Thr Trp Trp Leu Val Gln Ala Val Cys Ile Ala
242 115 120 125

245 Leu Met Ala Val Ile Gly Glu Tyr Leu Cys Met Arg Thr Glu Leu Lys
246 130 135 140

249 Glu Ile Pro Leu Ser Ser Ala Pro Lys Ser Asn Val
250 145 150 155

253 <210> SEQ ID NO: 5 

254 <211> LENGTH: 125 

255 <212> TYPE: PRT 

256 <213> ORGANISM: Mus musculus 
258 <400> SEQUENCE: 5 

260 Met Ala Leu Trp Ala Cys Gly Trp Arg Trp Trp Thr Arg Trp Cys Ala
261 1 5 10 15




RAW SEQUENCE LISTING 

PATENT APPLICATION: US/10/540,405 



DATE: 02/21/2007 
TIME: 14:18:33 



Input Set : E:\9301210999.txt 

Output Set: N:\CRF4\02212007\J540405.raw 



264 Gln Pro Val Pro Gly Pro Asp Val Arg Arg Gly Asp Pro Gly Leu Leu
265 20 25 30

268 His Pro Ser Arg Pro Ala Leu Asn Asp Val Leu Arg Pro Gln Arg Pro
269 35 40 45

272 His Leu Cys Pro Gly Leu Ala Val Leu His Pro Ala Arg Glu Ala Val
273 50 55 60

276 Pro Gly Phe His Cys His Cys Ala Phe Leu Ser Pro Pro Gly Leu Leu
277 65 70 75 80

280 Ala Leu Gln Leu Pro Phe Pro Leu Gly Ala Asp Leu Val Ala Gly Pro
281 85 90 95

284 Gly Cys Val His Cys Thr His Gly Arg His Arg Gly Val Pro Val His
285 100 105 110

288 Ala Asp Gly Ala Gln Gly Asp Pro Pro Gln Leu Ser Pro
289 115 120 125

292 <210> SEQ ID NO: 6
293 <211> LENGTH: 151
294 <212> TYPE: PRT
295 <213> ORGANISM: Mus musculus
297 <400> SEQUENCE: 6

299 Met Ala Gly Gln Phe Arg Ser Tyr Val Trp Asp Pro Leu Leu Ile Leu
300 1 5 10 15

303 Ser Gln Ile Val Leu Met Gln Thr Val Tyr Tyr Gly Ser Leu Gly Leu
304 20 25 30

307 Trp Leu Ala Leu Val Asp Ala Leu Val Arg Lys Pro Val Pro Gly Pro
308 35 40 45

311 Asp Val Arg Arg Gly Asp Pro Gly Leu Leu His Pro Ser Arg Pro Ala
312 50 55 60

315 Leu Asn Asp Val Leu Arg Pro Gln Arg Pro His Leu Cys Pro Gly Leu
316 65 70 75 80

319 Ala Val Leu His Pro Ala Arg Glu Ala Val Pro Gly Phe His Cys His
320 85 90 95

323 Cys Ala Phe Leu Ser Pro Pro Gly Leu Leu Ala Leu Gln Leu Pro Phe
324 100 105 110

327 Pro Leu Gly Ala Asp Leu Val Ala Gly Pro Gly Cys Val His Cys Thr
328 115 120 125

331 His Gly Arg His Arg Gly Val Pro Val His Ala Asp Gly Ala Gln Gly
332 130 135 140

335 Asp Pro Pro Gln Leu Ser Pro
336 145 150

339 <210> SEQ ID NO: 7
340 <211> LENGTH: 156
341 <212> TYPE: PRT
342 <213> ORGANISM: Mus musculus
344 <400> SEQUENCE: 7

346 Met Ala Gly Gln Phe Arg Ser Tyr Val Trp Asp Pro Leu Leu Ile Leu
347 1 5 10 15

350 Ser Gln Ile Val Leu Met Gln Thr Val Tyr Tyr Gly Ser Leu Gly Leu
351 20 25 30

354 Trp Leu Ala Leu Val Asp Ala Leu Val Arg Ser Ser Pro Ser Leu Asp
355 35 40 45

358 Gln Met Phe Asp Ala Glu Ile Leu Gly Phe Ser Thr Pro Pro Gly Arg
359 50 55 60

362 Leu Ser Met Met Ser Phe Val Leu Asn Ala Leu Thr Cys Ala Leu Gly
363 65 70 75 80

366 Leu Leu Tyr Phe Ile Arg Arg Gly Lys Gln Cys Leu Asp Phe Thr Val
367 85 90 95

370 Thr Val His Phe Phe His Leu Leu Gly Cys Trp Leu Tyr Ser Ser Arg
371 100 105 110

374 Phe Pro Ser Ala Leu Thr Trp Trp Leu Val Gln Ala Val Cys Ile Ala
375 115 120 125

378 Leu Met Ala Val Ile Gly Glu Tyr Leu Cys Met Arg Thr Glu Leu Lys
379 130 135 140

382 Glu Val Pro Leu Ser Ser Ala Pro Lys Ser Asn Val
383 145 150 155

386 <210> SEQ ID NO: 8
387 <211> LENGTH: 151
388 <212> TYPE: PRT
389 <213> ORGANISM: Homo sapiens
391 <400> SEQUENCE: 8

393 Met Ala Gly Gln Phe Arg Ser Tyr Val Trp Asp Pro Leu Leu Ile Leu
394 1 5 10 15

397 Ser Gln Ile Val Leu Met Gln Thr Val Tyr Tyr Gly Ser Leu Gly Leu
398 20 25 30

401 Trp Leu Ala Leu Val Asp Gly Leu Val Arg Gln Pro Leu Ala Gly Asp
402 35 40 45

405 Pro Val Arg Arg Arg Asp Pro Gly Leu Phe His Pro Ser Arg Pro Ala
406 50 55 60

409 Leu His Asp Val Leu His Pro Gln Arg Pro His Leu Cys Pro Gly Leu
410 65 70 75 80

413 Ala Val Leu His Pro Ala Arg Lys Ala Val Ser Gly Phe His Cys His
414 85 90 95

417 Cys Pro Phe Leu Ser Pro Pro Gly Leu Leu Val Leu Gln Leu Pro Phe
418 100 105 110

421 Pro Leu Gly Ala Asp Leu Val Ala Gly Pro Ser Arg Val His Cys Thr
422 115 120 125

425 His Gly Cys His Arg Gly Val Pro Val His Ala Asp Gly Ala Gln Gly
426 130 135 140

429 Asp Thr Pro Gln Leu Ser Pro
430 145 150

433 <210> SEQ ID NO: 9
434 <211> LENGTH: 3685
435 <212> TYPE: DNA
436 <213> ORGANISM: Mus musculus
439 <220> FEATURE:
440 <221> NAME/KEY: CDS
441 <222> LOCATION: (52)..(3195)
443 <220> FEATURE:
444 <221> NAME/KEY: misc_feature